Abstract

This paper describes a universal model
for paraphrasing that transforms sentences
according to defined criteria. We show
that by using different criteria we can
construct different kinds of paraphrasing
systems, including one for answering
questions, one for compressing sentences,
one for polishing sentences, and one for
transforming written language into spoken
language.

1 Introduction 

The term “paraphrasing” used in this paper
means the process of rewriting sentences without
altering the meaning. It includes generating easy
sentences from difficult ones, polished sentences
from broken or poor ones, or sentences used in
spoken language from ones used in written
language, which is useful for generating speech from
written texts. Moreover, generating concise
sentences that have almost the same meaning as their
long, tedious originals, which is classified under
summarization, is also a type of paraphrasing.
Paraphrasing is useful for many natural-language
processing techniques, so developing methods
for it is important.

This paper describes a universal model that
achieves many kinds of paraphrasing. It
transforms sentences according to predefined
criteria. We show that our model can handle many
kinds of paraphrases by selecting the most
appropriate criteria for each type of paraphrasing. Since
this model includes several criteria for different
types of paraphrasing, it can deal with a whole
range of paraphrasing types.

2 Paraphrasing model 

Figure 1 shows our paraphrasing model. It consists
of two modules: a transformation module and an
evaluation module. A sentence that requires
transformation is input into the system. Several
potential transformation types are generated in
the transformation module and then tested in the
evaluation module, where the most appropriate type
is selected. It is then used for the transformation
and the result is output.

Figure 1: Our paraphrasing model

• Transformation module 

This module generates the potential trans¬ 
formation types. They are based on hand¬ 
written rules, rules automatically detected by 
machines, or on a combination of both. 

• Evaluation module 

This module selects the most appropriate
transformation type by using predefined
criteria. The criteria need to be adapted each
time according to the particular problem they
should handle.
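
The two modules can be sketched in a few lines of Python. This is only a minimal illustration under our own assumptions, not the implementation used in the experiments: rules are taken to be plain string pairs (A, B), the names `transformation_module`, `evaluation_module` and `paraphrase` are hypothetical, and the criterion is any function that scores a candidate sentence (here, negative length).

```python
from typing import Callable, Iterable, List, Tuple

Rule = Tuple[str, str]  # (A, B): rewrite substring A as B (assumed rule format)


def transformation_module(sentence: str, rules: Iterable[Rule]) -> List[str]:
    """Generate candidates by applying each matching rule once."""
    candidates = [sentence]  # keeping the input allows "no change"
    for a, b in rules:
        if a in sentence:
            candidates.append(sentence.replace(a, b, 1))
    return candidates


def evaluation_module(candidates: List[str],
                      criterion: Callable[[str], float]) -> str:
    """Select the candidate with the highest criterion score."""
    return max(candidates, key=criterion)


def paraphrase(sentence: str, rules: List[Rule],
               criterion: Callable[[str], float], max_steps: int = 10) -> str:
    """Repeat transform-and-select until the score stops improving."""
    for _ in range(max_steps):
        best = evaluation_module(transformation_module(sentence, rules), criterion)
        if criterion(best) <= criterion(sentence):
            break
        sentence = best
    return sentence


# Toy rules and a length criterion (shorter is better).
rules = [("is not", "isn't"), ("in order to", "to")]
shorter = paraphrase("It is not needed in order to run.", rules, lambda s: -len(s))
print(shorter)
```

Swapping the lambda for a different scoring function changes the kind of paraphrasing without touching the rest of the loop, which is the point of separating the two modules.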

Here are several example criteria used in the 
evaluation module: 

• Similarity 

To establish the similarity between X and Y,
we first suppose that all the rewriting rules in
the transformation module comply with the
restriction that the transformation does not
alter the meaning. We then transform X and
Y in the transformation module so that they
are as similar as possible. We can then calculate
their similarity correctly even if X and Y
express the same meaning differently.


• Length 

To compress a sentence without changing
the meaning, as in the sentence-compression
process classified under
summarization, we again suppose that all the
rewriting rules in the transformation module
comply with the restriction that transformation
does not alter the meaning. Here, we use
the length of the sentence as the main criterion
of transformation. We can compress
a sentence by repeatedly transforming the input
sentence to decrease its length.


• Frequency (or probability of occurrence)^

To polish poor or unconnected sentences, we
again require that all the rewriting rules in the
transformation module comply with the
restriction that transformation does not alter
the meaning. We transform sentences that require
polishing according to how frequently parts of
these sentences appear in the corpora, so that
they are transformed into more sophisticated
ones. A simpler example illustrates this: if the
input data include the word “summarisation”
and the transformation module has a function
that changes “summarisation” to “summarization,”
we count how often “summarisation” and
“summarization” each appear in the Penn Treebank
or another corpus. When the frequency of
“summarization” exceeds that of “summarisation,”
we change “summarisation” in the input data to
“summarization.”


Furthermore, we can use several types of corpora
for measuring the frequency or calculating
the probability, and they give different
results. For example, when the input data are
in written language and a corpus
of spoken language is used, the input data are
converted into spoken language. Thus, if we
have input data that include “is not” and the
transformation module has a function that
changes “is not” to “isn’t”, then, since “isn’t”
occurs more frequently than “is not” in the
spoken-language corpus, “is not” is changed to
“isn’t.”


Now, suppose that the input data are sentences
that have difficult expressions such as
those used in legal documents. When we use
a set of easy sentences as the corpus for
measuring frequency, the difficult sentences
are transformed into easy ones. Or, suppose
that the input data is a novel written by an
unknown writer and a set of materials written
by Shakespeare is used as the corpus for
measuring frequency. In this case, a
new novel in Shakespearean style would be
output.^

^ Probabilities of occurrence in corpora have been
used in many studies on spelling-error correction and
generation (Kukich, 1992; Brown et al., 1993;
Ratnaparkhi, 2000).


• Judging the grammatical validity of a sentence

Measuring the frequency can be used to polish
sentences; therefore, it can also be used
to judge whether a sentence is grammatical
or not. But when that criterion is too restrictive
for establishing grammatical validity,
we can instead use only one of the following:

— The expressions used in the transformed
version should occur at least once in the
corpora. (This measure is often used in
spell-check systems (Kukich, 1992).)

— The probability of occurrence in the corpora
should exceed a certain threshold.

— The probability of occurrence in the corpora
should be higher than that when
the surroundings are not used for calculating
the probabilities.

The criteria we described here are more similar
to conditions, and would be most effective
when combined with other criteria. We
should use these criteria additionally when
other criteria cannot guarantee the grammatical
validity of a sentence in a transformation.


• Judging the equivalence in meaning of a sentence
before and after transformation

When we do not know whether the transformations
in the transformation module comply
with the restriction that a transformation
should not alter the meaning of a sentence,
this criterion is required. However, we doubt
that equivalence in meaning can be judged at
all.

For ad hoc solutions we can apply the following
two methods, either separately or simultaneously:

^ Moreover, our model can handle machine translation
(Brown et al., 1993) by applying translation rules
in the transformation module and using corpora written
in the target language for calculating the probabilities.





















Table 1: Sentences in the database (English translation of Japanese sentences) 

Usually, when a Japanese person hears that an American lives in New York, he or she thinks that 
the American lives in New York City. This is a common mistake, however. New York City takes up 
only a very small area of southern New York State. It takes about eight hours to drive from New 
York City to Niagara Falls, which is also in New York State. The majority of the state consists 
of mountains, forests, fields, rivers, lakes, and swamps. The people who live in these central and 
northern areas of the state usually live in small towns. Farming is the most common occupation 
among these New York State residents, and corn is the most common crop grown by them. 


— We check the transformation rules by
hand and only use those that satisfy the
meaning-equivalence criterion. And/or,
we list by hand the cases in which the rules
satisfy meaning-equivalence and
the ones in which they do not, and then judge
meaning-equivalence by using those data.

— We extract only those rules that reliably
satisfy meaning-equivalence
when they are extracted automatically.
And/or, the conditions under which the rules
satisfy meaning-equivalence are also learned
automatically when the rules are extracted
automatically.

As in the previous item, this item is more similar
to a condition than to a criterion. It is used
in addition to other measures.^

We can imagine measures other than the ones
we described.^

In the following sections, we will demonstrate 
what kinds of criteria we applied for transforma¬ 
tion in our model by using concrete examples from 
our research. 


3 The case of a QA system 


Our question-answering system uses the following
procedure:^ ^

^ We can imagine further criteria for similar conditions,
such as that the sentence should have at most
seven phrases, plus or minus two, whose modifiees are
not determined (Miller, 1956; Yngve, 1960; Murata et
al., 2001).


If research on polite expressions or easily under¬ 
standable expressions can successfully produce crite¬ 
ria, it should enable automatic transformation to po¬ 
lite or easily understandable expressions. The same 
result can be achieved by using corpora that include 
only polite or only easily understandable sentences for 
measuring the probabilities of occurrences. 


^ There are many studies on QA systems (Kupiec,
1993; TREC-8 committee, 1999).


^ Our question-answering system uses original paraphrasing
techniques. These techniques produce very


Table 3: Transformation rules used in the transformation
module

X is Y               Y is X
general              common
these X residents    these residents of X




1. The system extracts sentences that include the
answer to the question sentence from the sentences
in the database.


2. The system rewrites the extracted sentences 
and the question sentence so that they are as 
similar as possible. 


3. The system compares the rewritten sentences 
from the database to the rewritten question 
sentence. It then outputs the phrase in the 
rewritten sentence from the database that 
corresponds to the interrogative pronoun in 
the rewritten question sentence as the answer. 


For example, suppose we are given the data (Kyoukai,
1985) shown in Table 1 and are asked the
question “What is the most general occupation
among the residents of central and northern
New York State?” Our system^ transforms the
question sentence into a declarative sentence and
changes the interrogative pronoun to X. The
sentence most similar to this question sentence is
extracted, and the system reaches the state shown
in the first line of Table 2. We suppose that our
system applies the transformation rules shown in
Table 3. The question sentence and the extracted
sentence are rewritten to increase the similarity
between them. Finally, as shown in the table, the
similarity reaches a maximum level of 219.5 and
cannot be increased. At this stage, the system


accurate question-answering results because the system
detects answers when the transformed question
sentence and the transformed data sentence are very
similar.

^Our actual system is in Japanese and the example 
sentence in this paper is the English translation of the 
Japanese sentence. 



















Table 2: Examples of the QA system

Sim.   Type    Sentence

32.1   quest.  The most general occupation among the residents of central and northern New
               York State is X.
32.1   data    Farming is the most common occupation among these New York State residents,
               and corn is the most common crop grown by them.
103.1  data    Farming is the most general occupation among these New York State residents,
               and corn is the most common crop grown by them.
82.5   data    Farming is the most common occupation among these residents of New York State,
               and corn is the most common crop grown by them.
186.5  data    Farming is the most general occupation among these residents of New York State,
               and corn is the most common crop grown by them.

219.5  quest.  X is the most general occupation among the residents of central and northern New
               York State.
219.5  data    Farming is the most general occupation among these residents of New York State

Ans. = Farming
Sup. = Farming is the most general occupation among these residents of New York
       State


compares the sentence from the database and the 
question sentence and detects “Farming” easily by 
extracting the phrases in the sentence from the 
database that correspond to X. 
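
The three steps of the QA procedure can be sketched as follows. This is an illustrative toy under stated assumptions: the similarity measure is the token-level ratio from Python's `difflib.SequenceMatcher`, which stands in for the similarity measure actually used (the values in Table 2 come from that measure, not from this sketch); the function names are hypothetical; and the example sentences are shortened English versions of those in Table 2, with the question already in the "X is Y" form.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Token-level similarity (a stand-in for the paper's measure)."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def rewrite_toward(sentence: str, target: str, rules) -> str:
    """Greedily apply rules while the similarity to the target increases."""
    while True:
        candidates = [sentence.replace(a, b, 1) for a, b in rules if a in sentence]
        best = max(candidates, key=lambda c: similarity(c, target), default=sentence)
        if similarity(best, target) <= similarity(sentence, target):
            return sentence
        sentence = best


def extract_answer(question: str, data: str, slot: str = "X"):
    """Return the data phrase aligned with the slot token of the question."""
    q, d = question.split(), data.split()
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, q, d).get_opcodes():
        if tag == "replace" and slot in q[i1:i2]:
            return " ".join(d[j1:j2])
    return None


rules = [("common", "general"),
         ("these New York State residents", "these residents of New York State")]
question = "X is the most general occupation among the residents of New York State"
data = "Farming is the most common occupation among these New York State residents"

rewritten = rewrite_toward(data, question, rules)
answer = extract_answer(question, rewritten)
print(rewritten)
print(answer)
```

Because each accepted rewrite must strictly increase the similarity, the loop terminates once no rule brings the data sentence closer to the question.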

Our QA system paraphrases by using similarity
as a criterion. Because the system paraphrases
sentences to increase their similarity, it facilitates
comparing the question sentence and the data
sentence. We showed the QA system here as an
example, but the similarity criterion can also be used
in most cases that involve calculating similarity. For
example, a high-level information retrieval system could
determine the similarity of a query and a retrieved
document after they are rewritten so that their
similarity is as high as possible.^


In an anaphora resolution (Murata and Nagao 


199S), the system cannot resolve the anaphora when 
the identity or inclusion-relationship of “hole” ex¬ 
pressed by “a hole which is at the base of a huge 
cedar tree nearby” and “the hole at the base of the 
cedar tree” cannot be judged. But when we rewrite 
them based on the similarity criterion and obtain “a 
hole at the base of a huge cedar tree nearby” and “the 
hole at the base of the cedar tree,” the system under¬ 
stands that the former expression includes the latter 
one (that the two trees, “a huge cedar tree nearby” 
and “the cedar tree,” are the same is determined eas¬ 
ily because the former “tree” only has additional ad¬ 
jectives), and that the latter one refers to the former 


4 The case of a sentence 
compression system 


Recently, many studies on summarization have
been conducted. We also performed sentence
compression (Knight and Marcu, 2000), which is
classified under summarization.

We researched automatic extraction of rewriting
rules from two different dictionaries. Here is a
brief explanation of our research. Two Japanese
dictionaries gave the definitions shown in Figure
2 for the word abekobe, meaning “reverse”. We
expected to extract pairs of expressions having
the same meaning by comparing the two definitions,
since they both define the same word and
thus have the same meaning. We compared the two
by using the Unix command “diff” and obtained the
results shown in the figure. From the results, we


terchangeable, as well as sakasama-ni irekawatte 
“be changed upside-down” and hikkuri-kaet “be 
overturned”. We actually obtained 67,632 rewriting
rules by using this method. However, they also
included many incorrect ones. We therefore automatically
selected the rewriting rules that appear more
than once in the comparison, because such
rules are more accurate.^ The number of selected


^ In our actual experiments, we used probabilistic
equations to detect rules. We cannot explain this
method in detail in this paper. But the results would


















Definition of “reverse” in Dictionary A: 


junjo , ichi nado -no kankei -ga sakasama-ni irekawat -teiru 

(order) (,) (location) (etc.) (of) (relation) nom (upside-down) (change places) (-ing) 
(The relationship of the order, location and so on is changed upside-down.) 


Definition of “reverse” in Dictionary B:


junjo , ichi , kankei -ga hikkuri-kaet -teiru

(order) (,) (location) (,) (relation) nom (be overturned) (-ing)

(The relationship of the order and location is overturned.)


Results of comparing the two definitions

junjo , ichi

A: nado -no
   (etc.) (of)
B: ,

kankei -ga

A: sakasama-ni irekawat
   (upside-down) (be changed)
B: hikkuri-kaet
   (be overturned)

-teiru


Figure 2: Example of extracting rules for paraphrasing 


rules was 775. The research in this section uses 
these 775 rules as the rules in the transformation 
module. 
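
The rule-extraction idea described above (compare two definitions of the same word, keep the differing spans as rewriting-rule candidates, then keep only candidates seen more than once) can be sketched as follows. This is a simplified illustration, not the method used in the experiments: it uses Python's `difflib.SequenceMatcher` on romanized, pre-segmented tokens in place of the Unix `diff` command, the function names are hypothetical, and the second definition pair is invented solely to show the counting step.

```python
from collections import Counter
from difflib import SequenceMatcher


def extract_rules(def_a: str, def_b: str):
    """Return (A, B) pairs for the spans where two definitions differ."""
    a, b = def_a.split(), def_b.split()
    rules = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag == "replace":
            rules.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return rules


def select_reliable(definition_pairs, min_count: int = 2):
    """Keep only the rules that appear at least min_count times."""
    counts = Counter(rule
                     for a, b in definition_pairs
                     for rule in extract_rules(a, b))
    return [rule for rule, n in counts.items() if n >= min_count]


# The pair from Figure 2, followed by an invented pair for illustration.
pairs = [
    ("junjo , ichi nado -no kankei -ga sakasama-ni irekawat -teiru",
     "junjo , ichi , kankei -ga hikkuri-kaet -teiru"),
    ("ue -ga sakasama-ni irekawat -teiru",
     "ue -ga hikkuri-kaet -teiru"),
]

figure_rules = extract_rules(*pairs[0])
reliable = select_reliable(pairs)
print(figure_rules)
print(reliable)
```

Only replacements are kept here; pure insertions and deletions would need extra handling, for example rules whose left-hand or right-hand side is empty.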

Here we considered summarizing newspaper ar¬ 
ticles and used the following criteria in the evalu¬ 
ation module. 

• The transformed sentence should be as short
as possible.

• The expressions in the transformed sentence 
should appear at least once in the corpora, 
which contained two-years’ worth of newspa¬ 
per articles, to verify the grammatical valid¬ 
ity of a sentence. 


be similar to the results for detecting rules appearing
more than once.


Strictly speaking, we used the following procedures.


1. The system analyzed an input sentence morphologically
by using the Japanese morphological analyzer JUMAN
(Kurohashi and Nagao, 1998) and divided it into a string of
morphemes.


2. The system performed the following procedures
for each morpheme from left to right.

(a) When the string of morphemes S, whose
first morpheme is the current one (including
the empty string “”), matched
the A_i string of the transformation
rule R_i (A_i => B_i), the B_i string was
extracted as a candidate for the transformed
expression. We referred to the
string of the k-gram morphemes just before
S as S1_i and to the one just after as
S2_i.

(b) The system counted, for each B_i, how much
the string was shortened when A_i was
changed to B_i. We referred to the i
with the highest value as m.

(c) The system counted the frequency of the
string S1_m B_m S2_m in the corpus used
in the evaluation module. When it occurred
at least once, the system transformed
A_m to B_m and performed the
procedure on the next morpheme.


where k is a constant.

Here, we used a simple method that uses the k-gram
as the context for the calculation, to facilitate
the experiments. Since we used a simple method,
we set k at 2 to increase the precision rate and
decrease the recall rate.
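
The compression procedure can be sketched as below. This is a hedged illustration rather than the system used in the experiments: the input is assumed to be already segmented into morphemes (the morphological analysis step is skipped), rules are (A, B) pairs of token lists, the corpus is a list of token lists, shortening is measured in characters, and the function names `occurs` and `compress` are hypothetical. The toy corpus is invented for the example.

```python
def occurs(corpus, seq):
    """True if the token sequence appears contiguously in a corpus sentence."""
    n = len(seq)
    return any(s[i:i + n] == seq
               for s in corpus
               for i in range(len(s) - n + 1))


def compress(tokens, rules, corpus, k=2):
    """Shorten a morpheme list, left to right, using rules and k-gram context."""
    tokens, i = list(tokens), 0
    while i < len(tokens):
        # (a) rules whose left-hand side A matches at the current morpheme
        matches = [(a, b) for a, b in rules if a and tokens[i:i + len(a)] == a]
        # (b) the candidate that removes the most characters
        best = max(matches,
                   key=lambda r: len("".join(r[0])) - len("".join(r[1])),
                   default=None)
        if best:
            a, b = best
            s1 = tokens[max(0, i - k):i]                # context before A
            s2 = tokens[i + len(a):i + len(a) + k]      # context after A
            is_shorter = len("".join(b)) < len("".join(a))
            # (c) apply only if S1 + B + S2 is attested in the corpus
            if is_shorter and occurs(corpus, s1 + b + s2):
                tokens[i:i + len(a)] = b
                i += len(b)
                continue
        i += 1
    return tokens


sentence = ["rekishi", "-no", "nagare", "-no", "naka", "-de"]
rules = [(["-no", "nagare"], [])]                   # delete "of flow"
corpus = [["rekishi", "-no", "naka", "-de"]]        # invented toy corpus

compressed = compress(sentence, rules, corpus)
unchanged = compress(sentence, rules, corpus=[])
print(compressed)
```

With an empty corpus the rule is never applied, which reflects the role of the occurrence check as a guard on grammatical validity.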

We carried out the experiments on sentence
compression using newspaper articles. Examples
of the results are shown in Table 4. The underlined
part is the part that was removed in the
transformation. Because this section focused on
sentence compression, transformation rules that
remove strings were frequently used. The “from,”
“of flow,” and teki were appropriately deleted,
which succeeded in compressing the sentences.
But the results also included faulty deletions of 
to “and,” and surukoto (do). Omitting to changes 
the original meaning of jiyuu to minshushugi, “lib¬ 
erty and democracy”. Omitting surukoto (do) 
changes a verb youritsu (support) into a noun, 
and the phrase X-san wo “Mr. X” is missing a 






















Table 4: Example of sentence compression

Example of correctly transformed results

kokonoka -kara -no kankoku houmon -dewa

(9th) (from) (of) (Korea) (visit) (in)

(in the visit to Korea of from the 9th) 

rekishi -no nagare -no naka -de 

(history) (of) (flow) (of) (middle) (in) 

(in the middle of the flow of history) 

juuoku doru no tuika teki sochi 

(a billion) (dollar) (of) (supplement) (-ary) (step) 

(a supplementary step of one billion dollars) 

Example of incorrectly transformed results 
jiyuu to minshushugi 

(liberty) and (democracy) 

(liberty and democracy) 

X-san wo kouho to-shite youritsu suru koto wo kimeta

(Mr. X) obj (candidate) (as) (support) (do) obj (decide) 

(decided to support Mr. X as the candidate) 


verb. The sentence is therefore not grammatical.
Correcting these errors requires a new evaluation
method that includes syntactic features.

We emphasize that we can compress sentences 
by using the sentence length as a criterion in the 
evaluation module, as our experiments confirmed. 

5 The case of a
sentence-polishing-up system

In this section we describe a sentence-polishing- 
up system. We used the same 775 transformation 
rules as in the previous section. 

Here, we tried to polish sentences from newspa¬ 
per articles and applied the following criteria in 
the evaluation module. 

• The substrings of the transformed sentence
should occur frequently in the corpora, which
contained two years’ worth of newspaper
articles.^

We used the procedures of Section 4, with
part of step 2(c) changed to the
following:

^ We need to use a corpus/corpora with the same
subject as the target sentence to polish it up accordingly.

^ The research in this section is similar to that on
spelling or word correction (Kukich, 1992). But
in the cases of spelling or word correction, ungrammatical
sentences are input. In contrast, in sentence-
polishing-up systems, grammatical sentences are input
and the systems make them more sophisticated.
Performing such sentence-polishing-up without
rewriting rules such as the ones we automatically extract is difficult.


(c) The system counts the frequencies of the
strings S1_m A_m S2_m and S1_m B_m S2_m in
the corpus used in the evaluation module.
When the frequency of S1_m B_m S2_m exceeds
that of S1_m A_m S2_m, the
system transforms A_m to B_m and performs
the procedure on the next morpheme.

In this case, too, we used k = 2.^
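
The frequency comparison in the modified step (c) can be sketched as follows. This is a small illustration under our own assumptions, not the system used in the experiments: morphemes are given as token lists, the function names `frequency` and `polish_step` are hypothetical, only a single rule application at a given position is shown, and the toy corpus is invented.

```python
def frequency(corpus, seq):
    """Count contiguous occurrences of a token sequence in the corpus."""
    n = len(seq)
    return sum(s[i:i + n] == seq
               for s in corpus
               for i in range(len(s) - n + 1))


def polish_step(tokens, i, a, b, corpus, k=2):
    """Rewrite A as B at position i if S1 B S2 is more frequent than S1 A S2."""
    if tokens[i:i + len(a)] != a:
        return tokens                              # rule does not match here
    s1 = tokens[max(0, i - k):i]                   # k morphemes before A
    s2 = tokens[i + len(a):i + len(a) + k]         # k morphemes after A
    if frequency(corpus, s1 + b + s2) > frequency(corpus, s1 + a + s2):
        return tokens[:i] + b + tokens[i + len(a):]
    return tokens


# Invented toy corpus in which "ya" is more frequent than "," in this context.
corpus = [
    ["kazoku", "ya", "yuujin-ra", "to", "hanasu"],
    ["kazoku", "ya", "yuujin-ra", "to", "sugosu"],
    ["kazoku", ",", "yuujin-ra", "to", "kaeru"],
]
sentence = ["kazoku", ",", "yuujin-ra", "to", "sugosu"]

polished = polish_step(sentence, 1, [","], ["ya"], corpus)
print(polished)
```

Because only the corpus decides which variant wins, running the same step against a different corpus pushes the output toward the style of that corpus.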

We carried out the sentence-polishing experiments
using newspaper articles. Some results are
shown in Table 5. The lower strings are the transformed
ones. The results include sentences that
were made more comprehensible by adding ya
“and” and no “of”, and others where mo, meaning
“no less than”, was added incorrectly. The
latter transformation changed the meaning of the
sentence. In another case the tense was changed
incorrectly from shita “did” to suru “do”. These
errors resulted from errors in the automatic selection
of the transformation rules.

We want to stress that we can perform multiple 
kinds of paraphrasing by using various types of 
measures for an evaluation module. The results 

To improve the results of transformation, the fre¬ 
quency of each string x in our procedures must be 
changed to the probability of occurrence of x in the 
corpora when the given input data is used as the con¬ 
text. Although our procedures use the fixed k * 2 
morphemes of “in front” and “behind” as the con¬ 
text, we should calculate the probabilities by using 
the variable-length context and more global informa¬ 
tion, such as syntactic information and tense informa¬ 
tion, in the powerful probability-estimator such as the 
maximum entropy method. 
















Table 5: Examples of results of sentence-polishing-up

Examples of correctly transformed results

kazoku    ,      yuujin-ra  to      sugosu
(family)  (,)    (friends)  (with)  (live)
          ya
          and
(lives with her family, (, => and) her friends.)

shijiritsu             kaifuku
(support rates)        (recovery)
                 no
                 (of)
(support rates recovery (=> recovery of support rates))

Examples of wrongly transformed results

8  pointo                    uwamawan
8  (points)                  exceed
             -mo
             (no less than)
(exceed 8 points)

sakunen      -wo  shoutyou     shita
(last year)  obj  (symbolize)  (did)
                               suru
                               (do)
(symbolized last year)


of this section included cases where the system
increased the sentence lengths and modified the
input sentences to be more comprehensible; the
results were different from the ones obtained in
the compression tests. These results thus confirm
our assertion.

6 The case of a written-language- 
to-spoken-language 
transformation system 

Here, we tried to transform the sentences 
in written-language style to those in spoken- 
language style. 

We carried out new experiments for extracting
the rules that transform strings from written to
spoken language. These experiments were performed
by using the same method as in Section
4, comparing parallel corpora of written and
spoken language. Our institution has been compiling
these corpora. The written-language sentences
are taken from academic papers and the
spoken-language ones from their oral presentations.
We obtained 72,835 rewriting rules from
these experiments but, as in Section 4, many of
them are incorrect. We thus
automatically selected 240 rewriting rules that appear
more than once. This section uses these rules
in the transformation module.^

We tried using the 775 rules in Section ^ in addi¬ 
tion to these 240 rules for the experiments of this sec- 


We used the following criterion in the evaluation
module.

• The substrings of the transformed sentence
should occur frequently in the spoken-language
corpora.^

We followed the same procedure as in the pre¬ 
vious section. 

We only changed the corpus, from a newspaper
corpus to a spoken-language corpus. In the previous
section, because the subject of the input data
was the same as that of the corpora used in the evaluation
module, the system made the newspaper
articles more similar in style to newspaper articles
by polishing them up. In contrast, in this section,
because the input data was in written language
and the corpus used in the evaluation module was
in spoken language, the data was transformed
from written to spoken language.^

We input sentences from one of our papers in the
experiments. The results are shown in Table 6.
In these experiments, no incorrect transformation 
occurred. Ma is a filler and roughly means “so- 
so.” It is often used in spoken Japanese. Toiu 
means “that is” and is also used often in spo¬ 
ken Japanese. The transformed expressions in the 
table adequately produced the nuance of spoken 
Japanese. However, only a few transformed ex¬ 
pressions were obtained and the transformation 
recall rate was low. Therefore, we need to im¬ 
prove the system (Footnote . 

We want to reiterate that we can perform mul¬ 
tiple kinds of paraphrasing by applying various 
types of criteria in the evaluation module. In these 
experiments, we obtained expressions used in spo¬ 
ken language. So the results are sufficient to con¬ 
firm our claim. 

7 Conclusion 

We demonstrated that a method that transforms 
sentences based on certain criteria can be used as 
a universal model for paraphrasing. We showed 

tion. However, the results were worse than if we had 
not used them. This is because the 775 rules included 
faulty transformation rules such as suru => shita “do
=> did”. We believe that if the 775 rules had not included
such wrong rules, we could have used them as
well for the experiments in this section.

^ These corpora include 330,679 Japanese characters.

^ The corpora must be in the same domain as the
target into which the input data are transformed.

^ The research in this section is very similar to statistical
machine translation (Brown et al., 1993). In
this section, the source language is written language
and the target language is spoken language.











Table 6: Examples of transformation from written to spoken language

sono   teigi         -wo  riyou  suru  toiu-koto  -ga           kangaerareru
(its)  (definition)  obj  (use)  (do)  (that)     obj           (think)
                                                       ma
                                                       filler
(We can think (ma) that we use its definition.)

dougi           hyougen       -wo  tyushutsu  suru                  koto    -wo  kokoromiru.
(same-meaning)  (expression)  obj  (extract)  (do)                  (that)  obj  (try)
                                                    toiu
                                                    (that is/such)
(We try (such) that we extract the same-meaning expressions.)

hindo        -de   souto   -shita  kekka                         -wo  hyou     -ni   shimesu.
(frequency)  (by)  (sort)  (did)   (result)                      obj  (table)  (in)  (show)
                                             toiu       -no
                                             (that is)  (those)
(We show (those that is) the results sorted by frequency in the table.)





four ways our systems can apply this simple model 
and confirmed that by using different criteria we 
could construct different systems, including ques¬ 
tion answering, sentence compression, sentence 
polishing-up, and written-language to spoken- 
language transformation. 

Implementing various types of paraphrases us¬ 
ing a universal model has the following advan¬ 
tages: 

• Since some components in the model can be 
used in different types of systems, we can use 
them to construct more systems after having 
constructed one. 

• We can construct a new paraphrasing system 
by merely changing a small part of an exist¬ 
ing system (e.g. the criteria in the evaluation 
module). Therefore, we can construct new 
paraphrasing systems very easily. 

In the future, we hope to construct more types 
of paraphrasing systems by using our universal 
model. 